Information about the test:
| question | description |
|---|---|
| A1 | linear function |
| A2 | 3D |
| A3 | arithmetic rules |
| A4 | Easy ineq. |
| A5 | logs |
| A6 | logs |
| A7 | graph translation |
| A8 | sine pi/3 |
| A9 | trig.ineq. |
| A10 | trig.identity |
| A11 | period |
| A12 | rational exponents |
| A13 | ellipsoid |
| A14 | limit |
| A15 | series |
| A16 | diff.quotient |
| A17 | graph f' |
| A18 | product rule |
| A19 | quotient rule |
| A20 | (exp)' |
| A21 | (ln(sin))' |
| A22 | hyp.functions |
| A23 | slope tangent |
| A24 | IVT |
| A25 | velocity |
| A26 | int(poly) |
| A27 | int(exp) |
| A28 | Int = 0 |
| A29 | int even funct |
| A31 | int(abs) |
| A32 | FtoC algebra |
| A33 | difference vectors |
| A35 | intersect planes |
| A36 | parallel planes |
| A30 | FtoC graph |
| A34 | normal vector |
Load the student scores for the test - here we load the 2017 and 2018 ETH Zurich test data:
test_scores
## # A tibble: 3,433 x 38
## year class A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11
## <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 2017 s21t~ 1 0 1 1 1 0 1 0 1 0 1
## 2 2017 s21t~ 1 0 1 1 1 1 0 1 1 2 1
## 3 2017 s21t~ 1 0 0 0 1 1 1 0 1 1 1
## 4 2017 s21t~ 1 0 1 1 1 1 1 1 1 1 1
## 5 2017 s21t~ 1 0 1 0 2 0 1 0 1 0 2
## 6 2017 s21t~ 0 1 0 0 1 2 0 2 2 2 1
## 7 2017 s21t~ 1 0 1 0 2 1 0 1 2 1 0
## 8 2017 s21t~ 1 1 1 1 2 1 1 2 2 2 2
## 9 2017 s21t~ 1 1 0 1 1 1 1 1 1 1 1
## 10 2017 s21t~ 1 0 1 0 0 1 0 0 1 1 1
## # ... with 3,423 more rows, and 25 more variables: A12 <dbl>, A13 <dbl>,
## # A14 <dbl>, A15 <dbl>, A16 <dbl>, A17 <dbl>, A18 <dbl>, A19 <dbl>,
## # A20 <dbl>, A21 <dbl>, A22 <dbl>, A23 <dbl>, A24 <dbl>, A25 <dbl>,
## # A26 <dbl>, A27 <dbl>, A28 <dbl>, A29 <dbl>, A30 <dbl>, A31 <dbl>,
## # A32 <dbl>, A33 <dbl>, A34 <dbl>, A35 <dbl>, A36 <dbl>
The data includes scores of 2 for the “I don’t know” answer option. We replace these with 0, reflecting a non-correct answer:
test_scores <- test_scores %>%
mutate(across(starts_with("A"), ~ ifelse(. == 2, 0, .)))
The number of responses from each class:
test_scores %>%
group_by(year, class) %>%
tally() %>%
gt() %>%
data_color(
columns = c("n"),
colors = scales::col_numeric(palette = c("Blues"), domain = NULL)
)
## Warning: The `.dots` argument of `group_by()` is deprecated as of dplyr 1.0.0.
| class | n |
|---|---|
| 2017 | |
| s21t-000-2017 | 1682 |
| 2018 | |
| s21t-000-2018 | 1751 |
Mean and standard deviation for each item:
test_scores %>%
select(-class) %>%
group_by(year) %>%
skim_without_charts() %>%
select(-contains("character."), -contains("numeric.p"), -skim_type) %>%
rename(complete = complete_rate) %>%
# make the table wider, i.e. with separate columns for each year's results, with the year at the start of each column name
pivot_wider(names_from = year, values_from = -c(skim_variable, year), names_glue = "{year}__{.value}") %>%
# put the columns in order by year
select(sort(names(.))) %>%
select(skim_variable, everything()) %>%
# use GT to make the table look nice
gt(rowname_col = "skim_variable") %>%
# group the columns from each year
tab_spanner_delim(delim = "__") %>%
fmt_number(columns = contains("numeric"), decimals = 2) %>%
fmt_percent(columns = contains("complete"), decimals = 0) %>%
# change all the numeric.mean and numeric.sd column names to Mean and SD
cols_label(
.list = test_scores %>% select(year) %>% distinct() %>% transmute(col = paste0(year, "__numeric.mean"), label = "Mean") %>% deframe()
) %>%
cols_label(
.list = test_scores %>% select(year) %>% distinct() %>% transmute(col = paste0(year, "__numeric.sd"), label = "SD") %>% deframe()
) %>%
data_color(
columns = contains("numeric.mean"),
colors = scales::col_numeric(palette = c("Greens"), domain = NULL)
)
| 2017 | 2018 | |||||||
|---|---|---|---|---|---|---|---|---|
| complete | n_missing | Mean | SD | complete | n_missing | Mean | SD | |
| A1 | 100% | 0 | 0.95 | 0.22 | 100% | 0 | 0.95 | 0.21 |
| A2 | 100% | 0 | 0.31 | 0.46 | 100% | 0 | 0.28 | 0.45 |
| A3 | 100% | 0 | 0.62 | 0.48 | 100% | 0 | 0.64 | 0.48 |
| A4 | 100% | 0 | 0.64 | 0.48 | 100% | 0 | 0.62 | 0.48 |
| A5 | 100% | 0 | 0.47 | 0.50 | 100% | 0 | 0.49 | 0.50 |
| A6 | 100% | 0 | 0.71 | 0.46 | 100% | 0 | 0.73 | 0.44 |
| A7 | 100% | 0 | 0.71 | 0.46 | 100% | 0 | 0.70 | 0.46 |
| A8 | 100% | 0 | 0.47 | 0.50 | 100% | 0 | 0.50 | 0.50 |
| A9 | 100% | 0 | 0.46 | 0.50 | 100% | 0 | 0.46 | 0.50 |
| A10 | 100% | 0 | 0.54 | 0.50 | 100% | 0 | 0.54 | 0.50 |
| A11 | 100% | 0 | 0.57 | 0.49 | 100% | 0 | 0.58 | 0.49 |
| A12 | 100% | 0 | 0.68 | 0.47 | 100% | 0 | 0.71 | 0.45 |
| A13 | 100% | 0 | 0.37 | 0.48 | 100% | 0 | 0.36 | 0.48 |
| A14 | 100% | 0 | 0.56 | 0.50 | 100% | 0 | 0.57 | 0.49 |
| A15 | 100% | 0 | 0.46 | 0.50 | 100% | 0 | 0.46 | 0.50 |
| A16 | 100% | 0 | 0.19 | 0.39 | 100% | 0 | 0.20 | 0.40 |
| A17 | 100% | 0 | 0.59 | 0.49 | 100% | 0 | 0.60 | 0.49 |
| A18 | 100% | 0 | 0.38 | 0.49 | 100% | 0 | 0.36 | 0.48 |
| A19 | 100% | 0 | 0.35 | 0.48 | 100% | 0 | 0.34 | 0.47 |
| A20 | 100% | 0 | 0.72 | 0.45 | 100% | 0 | 0.72 | 0.45 |
| A21 | 100% | 0 | 0.51 | 0.50 | 100% | 0 | 0.52 | 0.50 |
| A22 | 100% | 0 | 0.52 | 0.50 | 100% | 0 | 0.52 | 0.50 |
| A23 | 100% | 0 | 0.41 | 0.49 | 100% | 0 | 0.43 | 0.50 |
| A24 | 100% | 0 | 0.59 | 0.49 | 100% | 0 | 0.58 | 0.49 |
| A25 | 100% | 0 | 0.54 | 0.50 | 100% | 0 | 0.73 | 0.44 |
| A26 | 100% | 0 | 0.80 | 0.40 | 100% | 0 | 0.80 | 0.40 |
| A27 | 100% | 0 | 0.39 | 0.49 | 100% | 0 | 0.40 | 0.49 |
| A28 | 100% | 0 | 0.42 | 0.49 | 100% | 0 | 0.46 | 0.50 |
| A29 | 100% | 0 | 0.48 | 0.50 | 100% | 0 | 0.52 | 0.50 |
| A30 | 100% | 0 | 0.76 | 0.43 | 100% | 0 | 0.73 | 0.44 |
| A31 | 100% | 0 | 0.38 | 0.48 | 100% | 0 | 0.37 | 0.48 |
| A32 | 100% | 0 | 0.23 | 0.42 | 100% | 0 | 0.20 | 0.40 |
| A33 | 100% | 0 | 0.78 | 0.42 | 100% | 0 | 0.76 | 0.43 |
| A34 | 100% | 0 | 0.58 | 0.49 | 100% | 0 | 0.61 | 0.49 |
| A35 | 100% | 0 | 0.32 | 0.47 | 100% | 0 | 0.33 | 0.47 |
| A36 | 100% | 0 | 0.22 | 0.41 | 100% | 0 | 0.23 | 0.42 |
Before applying IRT, we should check that the data satisfies the assumptions needed by the model. In particular, to use a 1-dimensional IRT model, we should have some evidence of unidimensionality in the test scores.
If the test is unidimensional then we would expect student scores on pairs of items to be correlated.
This plot shows the correlations between scores on each pair of items:
item_scores <- test_scores %>%
select(-class, -year)
cor_ci <- psych::corCi(item_scores, plot = FALSE)
psych::cor.plot.upperLowerCi(cor_ci)
Almost all correlations are significantly different from 0 - the only one that is not is:
cor_ci$ci %>%
as_tibble(rownames = "corr") %>%
filter(p > 0.05) %>%
arrange(-p) %>%
select(-contains(".e")) %>%
gt() %>%
fmt_number(columns = 2:4, decimals = 3)
| corr | lower | upper | p |
|---|---|---|---|
| A1-A2 | −0.006 | 0.056 | 0.116 |
Thus, the overall picture is that the item scores are well correlated with each other.
We use factor analysis as another way to assess unidimensionality. We are looking to see whether a 1-factor solution is feasible.
structure <- check_factorstructure(item_scores)
n <- n_factors(item_scores)
The choice of 6 dimensions is supported by 5 (21.74%) methods out of 23 (Optimal coordinates, Parallel analysis, Kaiser criterion, BIC, BIC).
plot(n)
summary(n) %>% gt()
| n_Factors | n_Methods |
|---|---|
| 1 | 4 |
| 2 | 1 |
| 3 | 1 |
| 4 | 1 |
| 6 | 5 |
| 8 | 1 |
| 11 | 3 |
| 26 | 1 |
| 28 | 1 |
| 33 | 1 |
| 34 | 1 |
| 35 | 3 |
#n %>% tibble() %>% gt()
pdf(file = "output/eth_pre_scree.pdf", width = 6, height = 4)
fa.parallel(item_scores, fa = "fa")
## Parallel analysis suggests that the number of factors = 9 and the number of components = NA
dev.off()
## png
## 2
This shows that both the 6-factor and 1-factor solutions are worth exploring.
We use the factanal function to fit a 1-factor model.
Note that this function cannot handle missing data, so any NA scores must be set to 0 for this analysis.
fitfact <- factanal(item_scores,
factors = 1,
rotation = "varimax")
print(fitfact, digits = 2, cutoff = 0.3, sort = TRUE)
##
## Call:
## factanal(x = item_scores, factors = 1, rotation = "varimax")
##
## Uniquenesses:
## A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 A13 A14 A15 A16
## 0.95 0.95 0.75 0.79 0.72 0.74 0.79 0.79 0.61 0.65 0.67 0.88 0.91 0.73 0.87 0.81
## A17 A18 A19 A20 A21 A22 A23 A24 A25 A26 A27 A28 A29 A30 A31 A32
## 0.78 0.78 0.78 0.63 0.59 0.61 0.64 0.73 0.94 0.69 0.63 0.64 0.85 0.81 0.78 0.78
## A33 A34 A35 A36
## 0.88 0.86 0.79 0.84
##
## Loadings:
## [1] 0.53 0.51 0.62 0.59 0.57 0.52 0.61 0.64 0.62 0.60 0.52 0.56 0.61 0.60
## [16] 0.50 0.45 0.46 0.46 0.35 0.31 0.36 0.44 0.47 0.46 0.47 0.39 0.44
## [31] 0.46 0.47 0.34 0.38 0.46 0.40
##
## Factor1
## SS loadings 8.35
## Proportion Var 0.23
##
## Test of the hypothesis that 1 factor is sufficient.
## The chi square statistic is 5447.17 on 594 degrees of freedom.
## The p-value is 0
load <- tidy(fitfact)
ggplot(load, aes(x = fl1, y = 0)) +
geom_point() +
geom_label_repel(aes(label = paste0("A", rownames(load))), show.legend = FALSE) +
labs(x = "Factor 1", y = NULL,
title = "Standardised Loadings",
subtitle = "Based upon correlation matrix") +
theme_minimal()
## Warning: ggrepel: 14 unlabeled data points (too many overlaps). Consider
## increasing max.overlaps
load %>%
select(question = variable, factor_loading = fl1) %>%
left_join(item_info, by = "question") %>%
arrange(-factor_loading) %>%
gt() %>%
data_color(
columns = contains("factor"),
colors = scales::col_numeric(palette = c("Greens"), domain = NULL)
)
| question | factor_loading | description |
|---|---|---|
| A21 | 0.6407106 | (ln(sin))' |
| A9 | 0.6228304 | trig.ineq. |
| A22 | 0.6223752 | hyp.functions |
| A27 | 0.6076260 | int(exp) |
| A20 | 0.6061690 | (exp)' |
| A23 | 0.5998775 | slope tangent |
| A28 | 0.5961137 | Int = 0 |
| A10 | 0.5878608 | trig.identity |
| A11 | 0.5709488 | period |
| A26 | 0.5590386 | int(poly) |
| A5 | 0.5324571 | logs |
| A14 | 0.5232019 | limit |
| A24 | 0.5215894 | IVT |
| A6 | 0.5086087 | logs |
| A3 | 0.4972984 | arithmetic rules |
| A19 | 0.4718772 | quotient rule |
| A17 | 0.4705175 | graph f' |
| A32 | 0.4683131 | FtoC algebra |
| A18 | 0.4646006 | product rule |
| A31 | 0.4643091 | int(abs) |
| A8 | 0.4602708 | sine pi/3 |
| A7 | 0.4593494 | graph translation |
| A35 | 0.4579597 | intersect planes |
| A4 | 0.4547066 | Easy ineq. |
| A30 | 0.4376910 | FtoC graph |
| A16 | 0.4367742 | diff.quotient |
| A36 | 0.4014553 | parallel planes |
| A29 | 0.3875447 | int even funct |
| A34 | 0.3765420 | normal vector |
| A15 | 0.3603866 | series |
| A12 | 0.3513586 | rational exponents |
| A33 | 0.3415341 | difference vectors |
| A13 | 0.3075858 | ellipsoid |
| A25 | 0.2364578 | velocity |
| A2 | 0.2342048 | 3D |
| A1 | 0.2277957 | linear function |
This factor appears to be dominated by “procedural calculus skills”: standard calculation of derivatives, integral and some trigonometry.
Here we also investigate the 6-factor solution, to see whether these factors are interpretable.
fitfact6 <- factanal(item_scores, factors = 6, rotation = "varimax")
print(fitfact6, digits = 2, cutoff = 0.3, sort = TRUE)
##
## Call:
## factanal(x = item_scores, factors = 6, rotation = "varimax")
##
## Uniquenesses:
## A1 A2 A3 A4 A5 A6 A7 A8 A9 A10 A11 A12 A13 A14 A15 A16
## 0.91 0.81 0.74 0.75 0.68 0.74 0.72 0.76 0.53 0.64 0.64 0.87 0.80 0.71 0.82 0.72
## A17 A18 A19 A20 A21 A22 A23 A24 A25 A26 A27 A28 A29 A30 A31 A32
## 0.69 0.41 0.00 0.47 0.53 0.59 0.60 0.63 0.91 0.58 0.60 0.61 0.83 0.71 0.76 0.75
## A33 A34 A35 A36
## 0.84 0.70 0.56 0.70
##
## Loadings:
## Factor1 Factor2 Factor3 Factor4 Factor5 Factor6
## A9 0.54 0.37
## A23 0.50
## A28 0.50
## A24 0.52
## A20 0.59
## A18 0.69
## A19 0.95
## A35 0.58
## A1
## A2 0.40
## A3 0.31
## A4 0.33
## A5 0.38
## A6
## A7 0.44
## A8 0.39
## A10 0.45
## A11 0.43 0.36
## A12
## A13 0.34
## A14 0.32
## A15
## A16 0.43
## A17 0.48
## A21 0.46 0.43
## A22 0.41 0.36
## A25
## A26 0.34 0.48
## A27 0.45 0.32
## A29
## A30 0.48
## A31 0.33
## A32 0.41
## A33 0.33
## A34 0.47
## A36 0.43
##
## Factor1 Factor2 Factor3 Factor4 Factor5 Factor6
## SS loadings 3.5 2.65 1.68 1.62 1.26 0.98
## Proportion Var 0.1 0.07 0.05 0.05 0.04 0.03
## Cumulative Var 0.1 0.17 0.22 0.26 0.30 0.32
##
## Test of the hypothesis that 6 factors are sufficient.
## The chi square statistic is 932.36 on 429 degrees of freedom.
## The p-value is 2.36e-39
load6 <- tidy(fitfact6)
ggplot(load6, aes(x = fl1, y = fl2)) +
geom_point() +
geom_label_repel(aes(label = paste0("A", rownames(load))), show.legend = FALSE) +
labs(x = "Factor 1", y = "Factor 2",
title = "Standardised Loadings",
subtitle = "Based upon correlation matrix") +
theme_minimal()
main_factors <- load6 %>%
# mutate(factorNone = 0.4) %>% # add this to set the main factor to "None" where all loadings are below 0.4
pivot_longer(names_to = "factor",
cols = contains("fl")) %>%
mutate(value_abs = abs(value)) %>%
group_by(variable) %>%
top_n(1, value_abs) %>%
ungroup() %>%
transmute(main_factor = factor, variable)
library(kableExtra)
##
## Attaching package: 'kableExtra'
## The following object is masked from 'package:dplyr':
##
## group_rows
load6 %>%
select(-uniqueness) %>%
# add the info about which is the main factor
left_join(main_factors) %>%
left_join(item_info %>% select(variable = question, description)) %>%
arrange(main_factor) %>%
select(main_factor, everything()) %>%
# arrange adjectives by descending loading on main factor
rowwise() %>%
mutate(max_loading = max(abs(c_across(starts_with("fl"))))) %>%
group_by(main_factor) %>%
arrange(-max_loading, .by_group = TRUE) %>%
select(-max_loading) %>%
# sort out the presentation
rename("Main Factor" = main_factor, # the _ throws a latex error
"Question" = variable) %>%
mutate_at(
vars(starts_with("fl")),
~ cell_spec(round(., digits = 3), bold = if_else(abs(.) > 0.4, T, F))
) %>%
kable(booktabs = T, escape = F, longtable = T) %>%
kableExtra::collapse_rows(columns = 1, valign = "top") %>%
kableExtra::kable_styling(latex_options = c("repeat_header"))
## Joining, by = "variable"
## Joining, by = "variable"
| Main Factor | Question | fl1 | fl2 | fl3 | fl4 | fl5 | fl6 | description |
|---|---|---|---|---|---|---|---|---|
| fl1 | A9 | 0.543 | 0.371 | 0.045 | 0.1 | 0.13 | 0.086 | trig.ineq. |
| A23 | 0.504 | 0.289 | 0.189 | 0.085 | 0.136 | 0.023 | slope tangent | |
| A28 | 0.503 | 0.292 | 0.145 | 0.094 | 0.147 | 0.048 | Int = 0 | |
| A21 | 0.456 | 0.194 | 0.429 | 0.13 | 0.143 | 0.07 | (ln(sin))’ | |
| A27 | 0.453 | 0.193 | 0.316 | 0.086 | 0.216 | 0.075 | int(exp) | |
| A10 | 0.447 | 0.239 | 0.234 | 0.104 | 0.084 | 0.182 | trig.identity | |
| A16 | 0.427 | -0.017 | 0.141 | 0.129 | 0.128 | 0.217 | diff.quotient | |
| A11 | 0.426 | 0.357 | 0.127 | 0.065 | 0.082 | 0.15 | period | |
| A22 | 0.408 | 0.237 | 0.36 | 0.108 | 0.21 | 0.073 | hyp.functions | |
| A32 | 0.407 | 0.122 | 0.106 | 0.104 | 0.19 | 0.115 | FtoC algebra | |
| A8 | 0.386 | 0.145 | 0.138 | 0.12 | 0.058 | 0.186 | sine pi/3 | |
| A5 | 0.379 | 0.182 | 0.2 | 0.093 | 0.114 | 0.278 | logs | |
| A31 | 0.325 | 0.262 | 0.059 | 0.07 | 0.178 | 0.151 | int(abs) | |
| A14 | 0.324 | 0.176 | 0.266 | 0.118 | 0.18 | 0.18 | limit | |
| A3 | 0.313 | 0.212 | 0.247 | 0.085 | 0.08 | 0.213 | arithmetic rules | |
| A6 | 0.293 | 0.268 | 0.257 | 0.072 | 0.122 | 0.136 | logs | |
| fl2 | A24 | 0.272 | 0.516 | 0.074 | 0.079 | 0.134 | 0.049 | IVT |
| A30 | 0.171 | 0.484 | 0.086 | 0.051 | 0.094 | 0.08 | FtoC graph | |
| A17 | 0.193 | 0.478 | 0.118 | 0.073 | 0.061 | 0.137 | graph f’ | |
| A7 | 0.167 | 0.443 | 0.157 | 0.036 | 0.095 | 0.161 | graph translation | |
| A4 | 0.225 | 0.334 | 0.118 | 0.064 | 0.084 | 0.249 | Easy ineq. | |
| A33 | 0.101 | 0.328 | 0.089 | 0.036 | 0.113 | 0.158 | difference vectors | |
| A1 | 0.046 | 0.27 | 0.089 | 0.027 | 0.029 | 0.06 | linear function | |
| A25 | 0.056 | 0.264 | 0.099 | 0.009 | 0.055 | 0.054 | velocity | |
| A12 | 0.185 | 0.191 | 0.143 | 0.083 | 0.072 | 0.168 | rational exponents | |
| fl3 | A20 | 0.286 | 0.268 | 0.593 | 0.09 | 0.115 | 0.059 | (exp)’ |
| A26 | 0.221 | 0.338 | 0.476 | 0.11 | 0.142 | 0.002 | int(poly) | |
| A29 | 0.23 | 0.126 | 0.27 | 0.093 | 0.145 | 0.03 | int even funct | |
| fl4 | A19 | 0.227 | 0.071 | 0.129 | 0.951 | 0.101 | 0.085 | quotient rule |
| A18 | 0.211 | 0.138 | 0.135 | 0.692 | 0.114 | 0.106 | product rule | |
| fl5 | A35 | 0.216 | 0.133 | 0.114 | 0.104 | 0.583 | 0.102 | intersect planes |
| A34 | 0.109 | 0.185 | 0.169 | 0.057 | 0.47 | 0.006 | normal vector | |
| A36 | 0.223 | 0.106 | 0.06 | 0.045 | 0.43 | 0.212 | parallel planes | |
| fl6 | A2 | 0.093 | 0.117 | 0.011 | 0.046 | 0.024 | 0.403 | 3D |
| A13 | 0.099 | 0.255 | -0.006 | 0.044 | 0.106 | 0.339 | ellipsoid | |
| A15 | 0.154 | 0.233 | 0.084 | 0.023 | 0.143 | 0.28 | series |
Here, the first factor seems better described by “abstract understanding of calculus”. The second is based on “graphical understanding of calculus”.
The third and fourth factors are “procedural calculus skills”, with the product and quotient rules distinguished in the fourth factor.
The fifth and sixth factors have a geometrical flavour, with the fifth factor in particular being about “vector geometry”.
The 6-factor solution does seem to be interpretable, however the 1-factor solution is still reasonable. In terms of explaining the variance in the data, the 1-factor solution does a good job at capturing a large chunk:
# Compute the proportion of variance explained; see https://stackoverflow.com/a/58159992/17459126
x1 <- loadings(fitfact)
propvar1 <- colSums(x1^2)/nrow(x1)
rbind(`Proportion Var` = propvar1, `Cumulative Var` = cumsum(propvar1))
## Factor1
## Proportion Var 0.2320524
## Cumulative Var 0.2320524
x6 <- loadings(fitfact6)
propvar6 <- colSums(x6^2)/nrow(x6)
rbind(`Proportion Var` = propvar6, `Cumulative Var` = cumsum(propvar6))
## Factor1 Factor2 Factor3 Factor4 Factor5
## Proportion Var 0.09717865 0.07370191 0.04679681 0.04500204 0.03500279
## Cumulative Var 0.09717865 0.17088056 0.21767737 0.26267941 0.29768220
## Factor6
## Proportion Var 0.02729802
## Cumulative Var 0.32498023
Therefore, we are justified in proceeding with a unidimensional IRT model.
We can fit a Multidimensional Item Response Theory (mirt) model. From the function definition:
mirt fits a maximum likelihood (or maximum a posteriori) factor analysis model to any mixture of dichotomous and polytomous data under the item response theory paradigm using either Cai's (2010) Metropolis-Hastings Robbins-Monro (MHRM) algorithm.
The process is to first fit the model, and save the result as a model object that we can then parse to get tabular or visual displays of the model that we might want. When fitting the model, we have the option to specify a few arguments, which then get interpreted as parameters to be passed to the model.
Show model fitting output
fit_2pl <- mirt(
data = item_scores, # just the columns with question scores
model = 1, # number of factors to extract
itemtype = "2PL", # 2 parameter logistic model
SE = TRUE # estimate standard errors
)
##
Iteration: 1, Log-Lik: -66668.278, Max-Change: 0.71582
Iteration: 2, Log-Lik: -65483.178, Max-Change: 0.26188
Iteration: 3, Log-Lik: -65316.598, Max-Change: 0.14767
Iteration: 4, Log-Lik: -65249.858, Max-Change: 0.11072
Iteration: 5, Log-Lik: -65212.345, Max-Change: 0.09658
Iteration: 6, Log-Lik: -65190.906, Max-Change: 0.06270
Iteration: 7, Log-Lik: -65178.102, Max-Change: 0.06095
Iteration: 8, Log-Lik: -65169.976, Max-Change: 0.03668
Iteration: 9, Log-Lik: -65165.338, Max-Change: 0.03883
Iteration: 10, Log-Lik: -65162.425, Max-Change: 0.02443
Iteration: 11, Log-Lik: -65160.764, Max-Change: 0.01661
Iteration: 12, Log-Lik: -65159.750, Max-Change: 0.01247
Iteration: 13, Log-Lik: -65158.047, Max-Change: 0.00237
Iteration: 14, Log-Lik: -65158.024, Max-Change: 0.00195
Iteration: 15, Log-Lik: -65158.012, Max-Change: 0.00168
Iteration: 16, Log-Lik: -65157.995, Max-Change: 0.00082
Iteration: 17, Log-Lik: -65157.990, Max-Change: 0.00089
Iteration: 18, Log-Lik: -65157.986, Max-Change: 0.00072
Iteration: 19, Log-Lik: -65157.976, Max-Change: 0.00018
Iteration: 20, Log-Lik: -65157.975, Max-Change: 0.00020
Iteration: 21, Log-Lik: -65157.975, Max-Change: 0.00019
Iteration: 22, Log-Lik: -65157.974, Max-Change: 0.00012
Iteration: 23, Log-Lik: -65157.973, Max-Change: 0.00012
Iteration: 24, Log-Lik: -65157.973, Max-Change: 0.00011
Iteration: 25, Log-Lik: -65157.973, Max-Change: 0.00009
##
## Calculating information matrix...
We compute Yen’s \(Q_3\) (1984) to check for any dependence between items after controlling for \(\theta\). This gives a score for each pair of items, with scores above 0.2 regarded as problematic (see DeMars, p. 48).
residuals_2pl %>% as.matrix() %>%
corrplot::corrplot(type = "upper")
This shows that most item pairs are independent, with only one pair showing cause for concern:
residuals_2pl %>%
rownames_to_column(var = "q1") %>%
as_tibble() %>%
pivot_longer(cols = starts_with("A"), names_to = "q2", values_to = "Q3_score") %>%
filter(abs(Q3_score) > 0.2) %>%
filter(parse_number(q1) < parse_number(q2)) %>%
gt()
| q1 | q2 | Q3_score |
|---|---|---|
| A18 | A19 | 0.6740812 |
| A34 | A35 | 0.2109123 |
Items A18 and A19 are on the product and quotient rules.
Given that this violation of the local independence assumption is very mild, we proceed using this model.
We then compute factor score estimates and augment the existing data frame with these estimates, to keep everything in one place. To do the estimation, we use the fscores() function from the mirt package which takes in a computed model object and computes factor score estimates according to the method specified. We will use the EAP method for factor score estimation, which is the “expected a-posteriori” method, the default. We specify it explicitly below, but the results would have been the same if we omitted specifying the method argument since it’s the default method the function uses.
test_scores_with_2pl_ability <- test_scores %>%
mutate(F1 = fscores(fit_2pl, method = "EAP"))
We can also calculate the model coefficient estimates using a generic function coef() which is used to extract model coefficients from objects returned by modeling functions. We will set the IRTpars argument to TRUE, which means slope intercept parameters will be converted into traditional IRT parameters.
coefs_2pl <- coef(fit_2pl, IRTpars = TRUE)
The resulting object coefs is a list, with one element for each question, and an additional GroupPars element that we won’t be using. The output is a bit long, so we’re only showing a few of the elements here:
coefs_2pl[1:3]
## $A1
## a b g u
## par 1.219104 -2.946041 0 1
## CI_2.5 1.024917 -3.298279 NA NA
## CI_97.5 1.413291 -2.593803 NA NA
##
## $A2
## a b g u
## par 0.5612745 1.670103 0 1
## CI_2.5 0.4752132 1.405353 NA NA
## CI_97.5 0.6473358 1.934852 NA NA
##
## $A3
## a b g u
## par 1.354723 -0.5385656 0 1
## CI_2.5 1.236060 -0.6126183 NA NA
## CI_97.5 1.473386 -0.4645129 NA NA
# coefs_2pl[35:37]
Let’s take a closer look at the first element:
coefs_2pl[1]
## $A1
## a b g u
## par 1.219104 -2.946041 0 1
## CI_2.5 1.024917 -3.298279 NA NA
## CI_97.5 1.413291 -2.593803 NA NA
In this output:
a is discriminationb is difficultyTo make this output a little more user friendly, we should tidy it such that we have a row per question. We’ll do this in two steps. First, write a function that tidies the output for one question, i.e. one list element. Then, map this function over the list of all questions, resulting in a data frame.
tidy_mirt_coefs <- function(x){
x %>%
# melt the list element
melt() %>%
# convert to a tibble
as_tibble() %>%
# convert factors to characters
mutate(across(where(is.factor), as.character)) %>%
# only focus on rows where X2 is a or b (discrimination or difficulty), or g (guessing parameter in 3PL)
filter(X2 %in% c("a", "b", "g")) %>%
# in X1, relabel par (parameter) as est (estimate)
mutate(X1 = if_else(X1 == "par", "est", X1)) %>%
# unite columns X2 and X1 into a new column called var separated by _
unite(X2, X1, col = "var", sep = "_") %>%
# turn into a wider data frame
pivot_wider(names_from = var, values_from = value)
}
Let’s see what this does to a single element in coefs:
tidy_mirt_coefs(coefs_2pl[1])
## # A tibble: 1 x 10
## L1 a_est a_CI_2.5 a_CI_97.5 b_est b_CI_2.5 b_CI_97.5 g_est g_CI_2.5
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 A1 1.22 1.02 1.41 -2.95 -3.30 -2.59 0 NA
## # ... with 1 more variable: g_CI_97.5 <dbl>
And now let’s map it over all 32 elements of coefs:
# use head(., -1) to remove the last element, `GroupPars`, which does not correspond to a question
tidy_2pl <- map_dfr(head(coefs_2pl, -1), tidy_mirt_coefs, .id = "Question")
A quick peek at the result:
tidy_2pl
## # A tibble: 36 x 10
## Question a_est a_CI_2.5 a_CI_97.5 b_est b_CI_2.5 b_CI_97.5 g_est g_CI_2.5
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 A1 1.22 1.02 1.41 -2.95 -3.30 -2.59 0 NA
## 2 A2 0.561 0.475 0.647 1.67 1.41 1.93 0 NA
## 3 A3 1.35 1.24 1.47 -0.539 -0.613 -0.465 0 NA
## 4 A4 1.18 1.07 1.29 -0.587 -0.670 -0.505 0 NA
## 5 A5 1.47 1.34 1.59 0.0797 0.0159 0.143 0 NA
## 6 A6 1.56 1.42 1.70 -0.859 -0.938 -0.781 0 NA
## 7 A7 1.29 1.17 1.41 -0.885 -0.975 -0.795 0 NA
## 8 A8 1.16 1.06 1.27 0.0659 -0.00726 0.139 0 NA
## 9 A9 2.02 1.86 2.18 0.128 0.0731 0.183 0 NA
## 10 A10 1.77 1.63 1.91 -0.145 -0.203 -0.0867 0 NA
## # ... with 26 more rows, and 1 more variable: g_CI_97.5 <dbl>
And a nicely formatted table of the result:
tidy_2pl %>%
# remove the redundant g_ variables which only make sense for the 3PL model
select(-starts_with("g_")) %>%
gt() %>%
fmt_number(columns = contains("_"), decimals = 3) %>%
data_color(
columns = contains("a_"),
colors = scales::col_numeric(palette = c("Greens"), domain = NULL)
) %>%
data_color(
columns = contains("b_"),
colors = scales::col_numeric(palette = c("Blues"), domain = NULL)
) %>%
tab_spanner(label = "Discrimination", columns = contains("a_")) %>%
tab_spanner(label = "Difficulty", columns = contains("b_")) %>%
cols_label(
a_est = "Est.",
b_est = "Est.",
a_CI_2.5 = "2.5%",
b_CI_2.5 = "2.5%",
a_CI_97.5 = "97.5%",
b_CI_97.5 = "97.5%"
)
| Question | Discrimination | Difficulty | ||||
|---|---|---|---|---|---|---|
| Est. | 2.5% | 97.5% | Est. | 2.5% | 97.5% | |
| A1 | 1.219 | 1.025 | 1.413 | −2.946 | −3.298 | −2.594 |
| A2 | 0.561 | 0.475 | 0.647 | 1.670 | 1.405 | 1.935 |
| A3 | 1.355 | 1.236 | 1.473 | −0.539 | −0.613 | −0.465 |
| A4 | 1.180 | 1.072 | 1.289 | −0.587 | −0.670 | −0.505 |
| A5 | 1.466 | 1.344 | 1.589 | 0.080 | 0.016 | 0.143 |
| A6 | 1.558 | 1.420 | 1.695 | −0.859 | −0.938 | −0.781 |
| A7 | 1.287 | 1.168 | 1.406 | −0.885 | −0.975 | −0.795 |
| A8 | 1.162 | 1.057 | 1.268 | 0.066 | −0.007 | 0.139 |
| A9 | 2.016 | 1.856 | 2.176 | 0.128 | 0.073 | 0.183 |
| A10 | 1.768 | 1.625 | 1.910 | −0.145 | −0.203 | −0.087 |
| A11 | 1.684 | 1.546 | 1.821 | −0.278 | −0.339 | −0.217 |
| A12 | 0.889 | 0.792 | 0.987 | −1.088 | −1.219 | −0.957 |
| A13 | 0.718 | 0.631 | 0.806 | 0.855 | 0.717 | 0.992 |
| A14 | 1.435 | 1.314 | 1.556 | −0.256 | −0.322 | −0.190 |
| A15 | 0.850 | 0.759 | 0.940 | 0.209 | 0.116 | 0.302 |
| A16 | 1.539 | 1.390 | 1.687 | 1.295 | 1.194 | 1.397 |
| A17 | 1.227 | 1.117 | 1.337 | −0.404 | −0.479 | −0.329 |
| A18 | 1.237 | 1.125 | 1.350 | 0.550 | 0.471 | 0.629 |
| A19 | 1.309 | 1.192 | 1.427 | 0.639 | 0.561 | 0.718 |
| A20 | 2.326 | 2.126 | 2.525 | −0.738 | −0.799 | −0.677 |
| A21 | 2.137 | 1.967 | 2.306 | −0.047 | −0.100 | 0.006 |
| A22 | 1.999 | 1.840 | 2.158 | −0.053 | −0.108 | 0.002 |
| A23 | 1.875 | 1.724 | 2.026 | 0.268 | 0.211 | 0.326 |
| A24 | 1.439 | 1.317 | 1.561 | −0.314 | −0.381 | −0.247 |
| A25 | 0.534 | 0.453 | 0.615 | −1.128 | −1.334 | −0.922 |
| A26 | 2.424 | 2.199 | 2.648 | −1.035 | −1.104 | −0.966 |
| A27 | 1.994 | 1.833 | 2.155 | 0.350 | 0.293 | 0.408 |
| A28 | 1.850 | 1.702 | 1.999 | 0.212 | 0.154 | 0.269 |
| A29 | 0.913 | 0.820 | 1.006 | −0.000 | −0.087 | 0.086 |
| A30 | 1.263 | 1.142 | 1.384 | −1.084 | −1.186 | −0.983 |
| A31 | 1.250 | 1.138 | 1.363 | 0.539 | 0.461 | 0.617 |
| A32 | 1.604 | 1.454 | 1.754 | 1.153 | 1.062 | 1.244 |
| A33 | 0.936 | 0.830 | 1.042 | −1.506 | −1.667 | −1.345 |
| A34 | 0.906 | 0.812 | 1.000 | −0.513 | −0.609 | −0.416 |
| A35 | 1.281 | 1.165 | 1.397 | 0.734 | 0.650 | 0.817 |
| A36 | 1.246 | 1.122 | 1.370 | 1.278 | 1.164 | 1.393 |
tidy_2pl %>%
mutate(qnum = parse_number(Question)) %>%
ggplot(aes(
x = a_est,
y = b_est
)) +
geom_errorbar(aes(ymin = b_CI_2.5, ymax = b_CI_97.5), width = 0, alpha = 0.5) +
geom_errorbar(aes(xmin = a_CI_2.5, xmax = a_CI_97.5), width = 0, alpha = 0.5) +
geom_text_repel(aes(label = Question), alpha = 0.5) +
geom_point() +
theme_minimal() +
labs(x = "Discrimination",
y = "Difficulty")
tidy_2pl %>%
write_csv("output/eth_pre_2pl-results.csv")
Do students from different programmes of study have different distributions of ability?
Compare the distribution of abilities in the year groups (though in this case there is only one).
ggplot(test_scores_with_2pl_ability, aes(F1, fill = as.factor(year), colour = as.factor(year))) +
geom_density(alpha=0.5) +
scale_x_continuous(limits = c(-3.5,3.5)) +
labs(title = "Density plot",
subtitle = "Ability grouped by year of taking the test",
x = "Ability", y = "Density",
fill = "Year", colour = "Year") +
theme_minimal()
Compare the distribution of abilities in the various classes.
ggplot(test_scores_with_2pl_ability, aes(x = F1, y = class, colour = class, fill = class)) +
geom_density_ridges(alpha = 0.5) +
scale_x_continuous(limits = c(-3.5,3.5)) +
guides(fill = FALSE, colour = FALSE) +
labs(title = "Density plot",
subtitle = "Ability grouped by class of taking the test",
x = "Ability", y = "Class") +
theme_minimal()
## Picking joint bandwidth of 0.194
plot(fit_2pl, type = "infoSE", main = "Test information")
plot(fit_2pl, type = "infotrace", main = "Item information curves")
Theta <- matrix(seq(-4,4,.01))
tinfo <- testinfo(fit_2pl, Theta)
plot(Theta, tinfo, type = 'l')
info_2pl <- areainfo(fit_2pl, c(-4,4))
info_2pl %>% gt()
| LowerBound | UpperBound | Info | TotalInfo | Proportion | nitems |
|---|---|---|---|---|---|
| -4 | 4 | 49.13213 | 50.42239 | 0.9744109 | 36 |
This shows that the total information in the test is 50.4223874.
plot_2pl_curve <- function(a,b,x,...) {exp(1.7*a*(x-b))/(1+exp(1.7*a*(x-b)))}
tidy_2pl %>%
crossing(data.frame(x = seq(
from = -6,
to = 6,
by = 0.05
))) %>%
mutate(a = a_est, b = b_est) %>%
mutate(
y = pmap_dbl(., plot_2pl_curve)
) %>%
mutate(Question = fct_reorder(Question, parse_number(Question))) %>%
ggplot(aes(x, y, colour = Question, alpha = a_est, label = Question)) +
geom_line()
plot(fit_2pl, type = "score", auto.key = FALSE)
We can get individual item surface and information plots using the itemplot() function from the mirt package, e.g.
mirt::itemplot(fit_2pl, item = 1,
main = "Trace lines for item 1")
We can also get the plots for all trace lines, one facet per plot.
plot(fit_2pl, type = "trace", auto.key = FALSE)
Or all of them overlaid in one plot.
plot(fit_2pl, type = "trace", facet_items=FALSE)
An alternative approach is using ggplot2 and plotly to add interactivity to make it easier to identify items.
# store the object
plt <- plot(fit_2pl, type = "trace", facet_items = FALSE)
# the data we need is in panel.args
plt_data <- tibble(
x = plt$panel.args[[2]]$x,
y = plt$panel.args[[2]]$y,
subscripts = plt$panel.args[[2]]$subscripts,
item = rep(colnames(item_scores), each = 200)
) %>%
mutate(
item_no = str_remove(item, "A") %>% as.numeric(),
item = fct_reorder(item, item_no)
)
head(plt_data)
## # A tibble: 6 x 5
## x y subscripts item item_no
## <dbl> <dbl> <int> <fct> <dbl>
## 1 -6 0.0236 201 A1 1
## 2 -5.94 0.0253 202 A1 1
## 3 -5.88 0.0272 203 A1 1
## 4 -5.82 0.0292 204 A1 1
## 5 -5.76 0.0314 205 A1 1
## 6 -5.70 0.0337 206 A1 1
plt_gg <- ggplot(plt_data, aes(x, y,
colour = item,
text = item)) +
geom_line() +
labs(
title = "2PL - Trace lines",
#x = expression(theta),
x = "theta",
#y = expression(P(theta)),
y = "P(theta)",
colour = "Item"
) +
theme_minimal() +
theme(legend.position = "none")
ggplotly(plt_gg, tooltip = "text")
The 3 parameter IRT model adds a third “guessing” parameter for each item, which gives the lower asymptote of the response curve.
Given that the test offered an “I don’t know” option (which we have mapped to a score of 0), we would expect that students are less likely to be guessing when answering these multiple-choice questions. So, we would expect to see the lower aymptotes being closer to 0 than would ordinarily be expected for a 4- or 5-option MCQ.
fit_3pl <- mirt(
data = item_scores, # just the columns with question scores
model = 1, # number of factors to extract
itemtype = "3PL", # 3 parameter logistic model
SE = TRUE # estimate standard errors
)
residuals_3pl <- residuals(fit_3pl, type = "Q3") %>% data.frame
coefs_3pl <- coef(fit_3pl, IRTpars = TRUE)
test_scores_with_3pl_ability <- test_scores %>%
mutate(F1 = fscores(fit_3pl, method = "EAP"))
We compute Yen’s \(Q_3\) (1984) to check for any dependence between items after controlling for \(\theta\). This gives a score for each pair of items, with scores above 0.2 regarded as problematic (see DeMars, p. 48).
residuals_3pl %>% as.matrix() %>%
corrplot::corrplot(type = "upper")
This shows that most item pairs are independent, with only a couple of pairs showing cause for concern:
residuals_3pl %>%
rownames_to_column(var = "q1") %>%
as_tibble() %>%
pivot_longer(cols = starts_with("A"), names_to = "q2", values_to = "Q3_score") %>%
filter(abs(Q3_score) > 0.2) %>%
filter(parse_number(q1) < parse_number(q2)) %>%
gt()
| q1 | q2 | Q3_score |
|---|---|---|
| A18 | A19 | 0.6745194 |
| A34 | A35 | 0.2115169 |
Items A18 and A19 are on the product and quotient rules, with an identical structure in each case (giving all the values that need to be used in the relevant formula).
Items A34 and A35 are are both about planes in vector geometry.
Given that this violation of the local independence assumption is very mild, we proceed using this model.
tidy_3pl <- map_dfr(head(coefs_3pl, -1), tidy_mirt_coefs, .id = "Question")
A quick peek at the result:
tidy_3pl
## # A tibble: 36 x 10
## Question a_est a_CI_2.5 a_CI_97.5 b_est b_CI_2.5 b_CI_97.5 g_est
## <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
## 1 A1 1.18 0.987 1.37 -3.01 -3.44 -2.58 0.0181
## 2 A2 0.828 0.513 1.14 1.81 1.55 2.07 0.105
## 3 A3 1.66 1.38 1.94 -0.233 -0.419 -0.0468 0.150
## 4 A4 1.38 1.14 1.62 -0.302 -0.546 -0.0572 0.130
## 5 A5 1.74 1.49 2.00 0.222 0.112 0.332 0.0708
## 6 A6 1.66 1.40 1.92 -0.717 -0.943 -0.491 0.0829
## 7 A7 1.29 1.12 1.47 -0.864 -1.11 -0.616 0.00980
## 8 A8 1.19 1.01 1.37 0.0938 -0.0734 0.261 0.00985
## 9 A9 2.04 1.82 2.26 0.142 0.0724 0.212 0.00233
## 10 A10 2.08 1.81 2.35 -0.00746 -0.104 0.0886 0.0724
## # ... with 26 more rows, and 2 more variables: g_CI_2.5 <dbl>, g_CI_97.5 <dbl>
And a nicely formatted table of the result:
gt(tidy_3pl) %>%
fmt_number(columns = contains("_"), decimals = 3) %>%
data_color(
columns = contains("a_"),
colors = scales::col_numeric(palette = c("Greens"), domain = NULL)
) %>%
data_color(
columns = contains("b_"),
colors = scales::col_numeric(palette = c("Blues"), domain = NULL)
) %>%
data_color(
columns = contains("g_"),
colors = scales::col_numeric(palette = c("Reds"), domain = NULL)
) %>%
tab_spanner(label = "Discrimination", columns = contains("a_")) %>%
tab_spanner(label = "Difficulty", columns = contains("b_")) %>%
tab_spanner(label = "Guessing", columns = contains("g_")) %>%
cols_label(
a_est = "Est.",
b_est = "Est.",
g_est = "Est.",
a_CI_2.5 = "2.5%",
b_CI_2.5 = "2.5%",
g_CI_2.5 = "2.5%",
a_CI_97.5 = "97.5%",
b_CI_97.5 = "97.5%",
g_CI_97.5 = "97.5%"
)
| Question | Discrimination | Difficulty | Guessing | ||||||
|---|---|---|---|---|---|---|---|---|---|
| Est. | 2.5% | 97.5% | Est. | 2.5% | 97.5% | Est. | 2.5% | 97.5% | |
| A1 | 1.177 | 0.987 | 1.367 | −3.012 | −3.442 | −2.581 | 0.018 | −0.170 | 0.206 |
| A2 | 0.828 | 0.513 | 1.144 | 1.806 | 1.546 | 2.066 | 0.105 | 0.025 | 0.184 |
| A3 | 1.662 | 1.383 | 1.941 | −0.233 | −0.419 | −0.047 | 0.150 | 0.063 | 0.237 |
| A4 | 1.379 | 1.137 | 1.620 | −0.302 | −0.546 | −0.057 | 0.130 | 0.022 | 0.237 |
| A5 | 1.743 | 1.485 | 2.000 | 0.222 | 0.112 | 0.332 | 0.071 | 0.021 | 0.120 |
| A6 | 1.663 | 1.404 | 1.922 | −0.717 | −0.943 | −0.491 | 0.083 | −0.040 | 0.206 |
| A7 | 1.290 | 1.115 | 1.465 | −0.864 | −1.111 | −0.616 | 0.010 | −0.113 | 0.132 |
| A8 | 1.188 | 1.008 | 1.368 | 0.094 | −0.073 | 0.261 | 0.010 | −0.059 | 0.079 |
| A9 | 2.039 | 1.816 | 2.262 | 0.142 | 0.072 | 0.212 | 0.002 | −0.024 | 0.029 |
| A10 | 2.080 | 1.806 | 2.355 | −0.007 | −0.104 | 0.089 | 0.072 | 0.026 | 0.119 |
| A11 | 1.896 | 1.636 | 2.155 | −0.148 | −0.266 | −0.029 | 0.068 | 0.008 | 0.127 |
| A12 | 0.962 | 0.701 | 1.224 | −0.806 | −1.576 | −0.036 | 0.112 | −0.181 | 0.405 |
| A13 | 0.879 | 0.629 | 1.129 | 1.043 | 0.788 | 1.298 | 0.077 | −0.015 | 0.169 |
| A14 | 1.642 | 1.403 | 1.880 | −0.093 | −0.235 | 0.049 | 0.080 | 0.013 | 0.147 |
| A15 | 0.978 | 0.717 | 1.239 | 0.420 | 0.081 | 0.760 | 0.078 | −0.047 | 0.204 |
| A16 | 1.729 | 1.455 | 2.003 | 1.275 | 1.179 | 1.371 | 0.013 | −0.005 | 0.031 |
| A17 | 1.228 | 1.118 | 1.339 | −0.395 | −0.473 | −0.318 | 0.001 | −0.007 | 0.009 |
| A18 | 1.253 | 1.138 | 1.367 | 0.552 | 0.474 | 0.630 | 0.000 | −0.003 | 0.004 |
| A19 | 1.327 | 1.206 | 1.447 | 0.641 | 0.562 | 0.719 | 0.000 | −0.004 | 0.005 |
| A20 | 2.322 | 2.021 | 2.624 | −0.720 | −0.844 | −0.596 | 0.007 | −0.069 | 0.084 |
| A21 | 2.151 | 1.979 | 2.322 | −0.034 | −0.088 | 0.019 | 0.000 | −0.002 | 0.002 |
| A22 | 2.066 | 1.858 | 2.274 | −0.022 | −0.091 | 0.047 | 0.012 | −0.014 | 0.038 |
| A23 | 1.893 | 1.739 | 2.046 | 0.277 | 0.220 | 0.335 | 0.000 | −0.002 | 0.003 |
| A24 | 1.443 | 1.320 | 1.565 | −0.304 | −0.372 | −0.236 | 0.001 | −0.006 | 0.007 |
| A25 | 0.536 | 0.454 | 0.618 | −1.110 | −1.365 | −0.855 | 0.003 | −0.038 | 0.045 |
| A26 | 2.387 | 2.165 | 2.608 | −1.034 | −1.106 | −0.963 | 0.001 | −0.008 | 0.010 |
| A27 | 2.019 | 1.855 | 2.184 | 0.358 | 0.301 | 0.415 | 0.000 | −0.001 | 0.001 |
| A28 | 1.871 | 1.718 | 2.025 | 0.221 | 0.163 | 0.279 | 0.000 | −0.005 | 0.006 |
| A29 | 0.920 | 0.826 | 1.014 | 0.006 | −0.081 | 0.092 | 0.000 | −0.004 | 0.005 |
| A30 | 1.250 | 1.125 | 1.376 | −1.081 | −1.215 | −0.946 | 0.005 | −0.044 | 0.053 |
| A31 | 1.439 | 1.197 | 1.681 | 0.622 | 0.508 | 0.737 | 0.045 | −0.003 | 0.092 |
| A32 | 1.635 | 1.481 | 1.789 | 1.142 | 1.053 | 1.232 | 0.000 | −0.001 | 0.002 |
| A33 | 0.929 | 0.814 | 1.044 | −1.496 | −1.779 | −1.212 | 0.010 | −0.099 | 0.119 |
| A34 | 0.908 | 0.809 | 1.006 | −0.500 | −0.630 | −0.370 | 0.003 | −0.031 | 0.038 |
| A35 | 1.306 | 1.149 | 1.462 | 0.737 | 0.644 | 0.830 | 0.002 | −0.023 | 0.027 |
| A36 | 1.644 | 1.321 | 1.967 | 1.276 | 1.172 | 1.379 | 0.044 | 0.017 | 0.071 |
tidy_3pl %>%
mutate(qnum = parse_number(Question)) %>%
ggplot(aes(
x = qnum,
y = g_est
)) +
geom_errorbar(aes(ymin = g_CI_2.5, ymax = g_CI_97.5), width = 0, alpha = 0.5) +
geom_text_repel(aes(label = Question), alpha = 0.5) +
geom_point() +
theme_minimal() +
labs(x = "Question",
y = "Guessing parameter")
tidy_3pl %>%
write_csv("output/eth_pre_3pl-results.csv")
TODO some comment on the 3PL results - i.e. looks like guessing is indeed rare, but perhaps happening a bit in some questions (any ideas why?)
In this analysis we used the following packages. You can learn more about each one by clicking on the links below.